Discovering Potential Clinical Profiles of Multiple Sclerosis from Clinical and Pathological Free Text Data with Constrained Non-negative Matrix Factorization

Authors

  • Jacopo Acquarelli
  • Monica Bianchini
  • Elena Marchiori
Abstract

Constrained non-negative matrix factorization (CNMF) is an effective machine learning technique to cluster documents in the presence of class label constraints. In this work, we provide a novel application of this technique in research on neuro-degenerative diseases. Specifically, we consider a dataset of documents from the Netherlands Brain Bank containing free text describing clinical and pathological information about donors affected by Multiple Sclerosis. The goal is to use CNMF for identifying clinical profiles with pathological information as constraints. After pre-processing the documents by means of standard filtering techniques, a feature representation of the documents in terms of bi-grams is constructed. The high dimensional feature space is reduced by applying a trimming procedure. The resulting datasets of clinical and pathological bi-grams are then clustered using non-negative matrix factorization (NMF) and, next, clinical data are clustered using CNMF with constraints induced by the clustering of pathological data. Results indicate the presence of interesting clinical profiles, for instance related to vision or movement problems. In particular, the use of CNMF leads to the identification of a clinical profile related to diabetes mellitus. Pathological characteristics and duration of disease of the identified profiles are analysed. Although highly promising, results of this investigation should be interpreted with care due to the relatively small size of the considered datasets.
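The first stages of the pipeline described in the abstract (bi-gram features, trimming of the feature space, clustering with NMF) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy documents, the function names, the document-frequency threshold `min_df`, and the choice of multiplicative updates for the Frobenius-norm objective are all assumptions made for illustration. The constrained step (CNMF) is not shown.

```python
from collections import Counter
import numpy as np

def bigrams(text):
    """Return the adjacent word pairs of a lower-cased, whitespace-split text."""
    tokens = text.lower().split()
    return list(zip(tokens, tokens[1:]))

def bigram_matrix(docs, min_df=2):
    """Build a documents x bi-grams count matrix.

    Bi-grams occurring in fewer than `min_df` documents are dropped;
    this is an assumed, simple stand-in for the trimming procedure.
    """
    per_doc = [Counter(bigrams(d)) for d in docs]
    doc_freq = Counter(bg for counts in per_doc for bg in counts)
    vocab = sorted(bg for bg, n in doc_freq.items() if n >= min_df)
    X = np.array([[c[bg] for bg in vocab] for c in per_doc], dtype=float)
    return X, vocab

def nmf(X, k, n_iter=200, seed=0, eps=1e-9):
    """Approximate X by W @ H with W, H >= 0 via multiplicative updates."""
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], k)) + eps
    H = rng.random((k, X.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Hypothetical free-text snippets, invented for this example only.
docs = [
    "blurred vision left eye",
    "blurred vision right eye",
    "walking difficulties muscle weakness",
    "severe walking difficulties muscle weakness",
]
X, vocab = bigram_matrix(docs, min_df=2)
W, H = nmf(X, k=2)
clusters = W.argmax(axis=1)  # each document goes to its dominant factor
```

Each row of `W` gives a document's loading on the `k` factors, and each row of `H` gives the weight of every retained bi-gram in a factor, so the largest entries of a row of `H` suggest how a cluster might be read as a profile.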


Similar resources

A new approach for building recommender system using non negative matrix factorization method

Nonnegative Matrix Factorization is a new approach to reduce data dimensions. In this method, by applying the nonnegativity of the matrix data, the matrix is decomposed into components that are more interrelated and divide the data into sections where the data in these sections have a specific relationship. In this paper, we use the nonnegative matrix factorization to decompose the user ratin...

Identifiable Phenotyping using Constrained Non-Negative Matrix Factorization

This work proposes a new algorithm for automated and simultaneous phenotyping of multiple co-occurring medical conditions, also referred to as comorbidities, using clinical notes from the electronic health records (EHRs). A basic latent factor estimation technique of non-negative matrix factorization (NMF) is augmented with domain specific constraints to obtain sparse latent factors that are ancho...

Automated tissue segmentation and blind recovery of (1)H MRS imaging spectral patterns of normal and diseased human brain.

Constrained non-negative matrix factorization (cNMF) with iterative data selection is described and demonstrated as a data analysis method for fast and automatic recovery of biochemically meaningful and diagnostically specific spectral patterns of the human brain from (1)H MRS imaging ((1)H MRSI) data. To achieve this goal, cNMF decomposes in vivo multidimensional (1)H MRSI data into two non-ne...

Iterative Weighted Non-smooth Non-negative Matrix Factorization for Face Recognition

Non-negative Matrix Factorization (NMF) is a part-based image representation method. It comes from the intuitive idea that entire face image can be constructed by combining several parts. In this paper, we propose a framework for face recognition by finding localized, part-based representations, denoted “Iterative weighted non-smooth non-negative matrix factorization” (IWNS-NMF). A new cost fun...

Predictive ability of 18F-fluorodeoxyglucose positron emission tomography/computed tomography for pathological complete response and prognosis after neoadjuvant chemotherapy in triple-negative breast cancer patients

Objective The mortality of patients with locally advanced triple-negative breast cancer (TNBC) is high, and pathological complete response (pCR) to neoadjuvant chemotherapy (NAC) is associated with improved prognosis. This retrospective study was designed and powered to investigate the ability of 18F-fluorodeoxyglucose positron emission tomography/computed tomography (FDG-PET/CT) to predict pat...


Publication date: 2016